7.3 THE DIGITAL STEP EDGE


Robert M. Haralick

Departments of Electrical Engineering and Computer Science
Virginia Polytechnic Institute and State University
Blacksburg, Virginia 24061


Abstract 

We use the facet model to accomplish step edge detection.
The essence of the facet model is that any analysis made on the
basis of the pixel values in some neighborhood has its final
authoritative interpretation relative to the underlying grey tone
intensity surface of which the neighborhood pixel values are
observed noisy samples.

Pixels which are part of regions have simple grey tone
intensity surfaces over their areas. Pixels which have an edge
in them have complex grey tone intensity surfaces over their
areas. Specifically, an edge moves through a pixel if and only
if there is some point in the pixel's area having a zero crossing
of the second directional derivative taken in the direction of a
non-zero gradient at the pixel's center.

To determine whether or not a pixel should be marked as a
step edge pixel, its underlying grey tone intensity surface must
be estimated on the basis of the pixels in its neighborhood. For
this, we use a functional form consisting of a linear combination
of the tensor products of discrete orthogonal polynomials of up
to degree three. The appropriate directional derivatives are
easily computed from this kind of a function.

Upon comparing the performance of this zero crossing of
second directional derivative operator with the Prewitt gradient
operator and the Marr-Hildreth zero crossing of Laplacian
operator, we find that it is the best performer and is followed
by the Prewitt gradient operator. The Marr-Hildreth zero-crossing
of Laplacian operator performs the worst.

I. Introduction

What is an edge in a digital image? The first intuitive 
notion is that a digital edge occurs on the boundary between two 
pixels when the respective brightness values of the two pixels 
are significantly different. Significantly different may depend
upon the distribution of brightness values around each of the
pixels.

We often point to a region on an image and say this region 
is brighter than its surrounding area, meaning that the mean of
the brightness values of pixels inside the region is brighter 
than the mean of the brightness values outside the region. 
Having noticed this we would then say that an edge exists between 
each pair of neighboring pixels where one pixel is inside the 
brighter region and the other is outside the region. Such edges 
are referred to as step edges. 

Step edges are not the only kind of edge. If we scan
through a region in a left right manner observing the brightness
values steadily increasing and then after a certain point observe
that the brightness values are steadily decreasing we are likely
to say that there is an edge at the point of change from
increasing to decreasing brightness values. Such edges are
called roof edges.

It is, therefore, clear from our use of the word edge that
edge refers to places in the image where there appears to be a
jump in brightness value or a local extremum in brightness value
derivative. Jumps in brightness values are the kinds of edges
originally detected by Roberts (1965). Relative extrema of first
derivative in a one dimensional form are used by Ehrich and
Schroeder (1981) and in an isotropic two-dimensional suboptimal
form by Marr and Hildreth (1980).

In some sense this summary statement about edges is quite
revealing since in a discrete array of brightness values there
are jumps, in the literal sense, between neighboring brightness
values if the brightness values are different, even if only
slightly different. Perhaps more to the heart of the matter,
there exists no definition of derivative for a discrete array of
brightness values. The only way to interpret jumps in value or
local extrema of derivatives when referring to a discrete array
of values is to assume that the discrete array of values comes
about as some kind of sampling of a real-valued function $f$
defined on a bounded and connected subset of the real plane
$R^2$. The jumps in value or extrema in derivative really must
refer to points of high first derivative of $f$ and to points of
relative extrema in the second derivatives of $f$. Edge detection
must then involve fitting a function to the sample values.
Prewitt (1970) was the first to suggest the fitting idea.
Hueckel (1971, 1973), Brooks (1978), Haralick (1980), Haralick
and Watson (1981), Morgenthaler and Rosenfeld (1981), Zucker and
Hummel (1979), and Morgenthaler (1981) all use the surface fit
concept in determining edges.

Edge finders should then regard the digital picture function
as a sampling of the underlying function $f$, where some kind of
random noise has been added to the true function values. To do
this, the edge finder must assume some kind of parametric form
for the underlying function $f$, use the sampled brightness values
of the digital picture function to estimate the parameters, and
finally make decisions regarding the locations of discontinuities
and the locations of relative extrema of partial derivatives
based on the estimated values of the parameters.

Of course, it is impossible to determine the true locations
of discontinuities in value or relative extrema in derivatives
directly from a sampling of the function. The locations are
estimated by function approximation. Sharp discontinuities can
reveal themselves in high values for estimates of first partial
derivatives. Relative extrema in first directional derivative
can reveal themselves as zero-crossings of the second directional
derivative. Thus, if we assume that the first and second partial
derivatives of any possible underlying image function have known
bounds, then any estimated first or second order partials which
exceed these known bounds must be due to discontinuities in value
or in derivative of the underlying function. This is the basis for
the gradient magnitude and Laplacian magnitude edge detectors.
However, edges can be weak but well localized. Such edges, as
well as the strong edges just discussed, manifest themselves as
local extrema of the derivative taken across the edge. This idea
for edges is the basis of the edge detector discussed here.

In this paper, we assume that in each neighborhood of the
image the underlying function $f$ takes the parametric form of a
polynomial in the row and column coordinates and that the
sampling producing the digital picture function is a regular
equal interval grid sampling of the square plane which is the
domain of $f$. As just mentioned, we place edges not at locations
of high gradient, but at locations of spatial gradient maxima.
More precisely, a pixel is marked as an edge pixel if in the
pixel's immediate area there is a zero crossing of the second
directional derivative taken in the direction of the gradient.
Thus this kind of edge detector will respond to weak but
spatially peaked gradients.

The underlying functions from which the directional
derivatives are computed are easy to represent as linear
combinations of the polynomials in any polynomial basis set.
That polynomial basis set which permits the independent
estimation of each coefficient would be the easiest to use. Such
a polynomial basis set is the discrete orthogonal polynomial
basis set.

Section II discusses the polynomials. In section II.1 we
discuss how to construct the one dimensional family of discrete
orthogonal polynomials. In section II.2 we discuss how arbitrary
two dimensional polynomials can be composed as linear
combinations of the tensor products of one dimensional discrete
orthogonal polynomials. In section II.3, we discuss how the
discretely sampled data values are used to estimate the
coefficients of the linear combinations: coefficient estimates
for exactly fitting or estimates for least square fitting are
calculated as linear combinations of the sampled data values.

Having used the pixel values in a neighborhood to estimate 
the underlying polynomial function we can now determine the value 
of the partial derivatives at any location in the neighborhood 
and use those values in edge finding. Having to deal with 
partials in both the row and column directions makes using these 
derivatives a little more complicated than using the simple

II. The Discrete Orthogonal Polynomials

These polynomials are sometimes called the discrete
Chebychev polynomials (Beckmann, 1973). In this section we show
how to construct them for one or two variables and how to use
them in fitting data.

II.1 Discrete Orthogonal Polynomial Construction Technique

Let the index set $R$ be symmetric in the sense that $r \in R$
implies $-r \in R$. Let $P_n(r)$ be the $n$th order polynomial. We
define the construction technique for discrete orthogonal
polynomials iteratively.

Define $P_0(r) = 1$.

Suppose $P_0(r), \ldots, P_{n-1}(r)$ have been defined. In general,
$P_n(r) = r^n + a_{n-1} r^{n-1} + \cdots + a_1 r + a_0$. $P_n(r)$ must
be orthogonal to each polynomial $P_0(r), \ldots, P_{n-1}(r)$. Hence,
we must have the $n$ equations

$$\sum_{r \in R} P_k(r)\,\bigl(r^n + a_{n-1} r^{n-1} + \cdots + a_1 r + a_0\bigr) = 0, \qquad k = 0, \ldots, n-1 \tag{1}$$

These equations are linear equations in the unknowns
$a_0, \ldots, a_{n-1}$ and are easily solved by standard techniques.
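
The construction can be sketched in a few lines of Python. This is
only an illustration under stated assumptions: the function name
`discrete_orthogonal_polys` is hypothetical, and the linear
equations (1) are solved by Gram-Schmidt projection of the monomial
$r^n$ onto the lower order polynomials, which yields the same monic
polynomial as any other standard solution technique. Exact rational
arithmetic is used so the coefficients can be read off directly.

```python
from fractions import Fraction

def discrete_orthogonal_polys(index_set, max_degree):
    """Return coefficient lists (lowest power first) of the monic
    polynomials P_0..P_max_degree, mutually orthogonal over index_set."""
    pts = [Fraction(r) for r in index_set]

    def evaluate(coeffs, r):
        return sum(a * r**k for k, a in enumerate(coeffs))

    polys = []
    for n in range(max_degree + 1):
        # Start from the monomial r^n and subtract its projection onto
        # every lower order polynomial already built; the result is
        # orthogonal to all of them, as equation (1) requires.
        coeffs = [Fraction(0)] * n + [Fraction(1)]
        for p in polys:
            num = sum(evaluate(p, r) * r**n for r in pts)
            den = sum(evaluate(p, r) ** 2 for r in pts)
            proj = num / den
            for k, a in enumerate(p):
                coeffs[k] -= proj * a
        polys.append(coeffs)
    return polys

# Example: the five-point index set {-2, -1, 0, 1, 2}.
five_point = discrete_orthogonal_polys([-2, -1, 0, 1, 2], 4)
```

The maximum degree must stay below the number of points in the
index set; otherwise a denominator in the projection becomes zero.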

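The first five polynomial function formulas are

$$P_0(r) = 1$$

$$P_1(r) = r$$

$$P_2(r) = r^2 - \mu_2/\mu_0$$

$$P_3(r) = r^3 - (\mu_4/\mu_2)\, r$$

$$P_4(r) = r^4 + \frac{(\mu_2 \mu_4 - \mu_0 \mu_6)\, r^2 + (\mu_2 \mu_6 - \mu_4^2)}{\mu_0 \mu_4 - \mu_2^2}$$

where

$$\mu_k = \sum_{s \in R} s^k$$
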
II.2 Two Dimensional Discrete Orthogonal Polynomials

Two dimensional discrete orthogonal polynomials can be
created from two sets of one dimensional discrete orthogonal
polynomials by taking tensor products. Let $R$ and $C$ be index sets
satisfying the symmetry condition $r \in R$ implies $-r \in R$ and
$c \in C$ implies $-c \in C$. Let $\{P_0(r), \ldots, P_N(r)\}$ be a set
of discrete orthogonal polynomials on $R$. Let
$\{Q_0(c), \ldots, Q_M(c)\}$ be a set of discrete orthogonal
polynomials on $C$. Then the set
$\{P_0(r)Q_0(c), \ldots, P_n(r)Q_m(c), \ldots, P_N(r)Q_M(c)\}$ is a
set of discrete orthogonal polynomials on $R \times C$.

The proof of this fact is easy. Consider whether $P_i(r)Q_j(c)$
is orthogonal to $P_n(r)Q_m(c)$ when $n \neq i$ or $m \neq j$. Then

$$\sum_{r \in R} \sum_{c \in C} P_i(r) Q_j(c) P_n(r) Q_m(c) = \sum_{r \in R} P_i(r) P_n(r) \sum_{c \in C} Q_j(c) Q_m(c).$$

Since $n \neq i$ or $m \neq j$, one or other of the sums must be zero.

Index Set                     Discrete Orthogonal Polynomial Set

$\{-1/2,\ 1/2\}$                $\{1,\ r\}$

$\{-1,\ 0,\ 1\}$                $\{1,\ r,\ r^2 - 2/3\}$

$\{-3/2,\ -1/2,\ 1/2,\ 3/2\}$   $\{1,\ r,\ r^2 - 5/4,\ r^3 - (41/20)\,r\}$

$\{-2,\ -1,\ 0,\ 1,\ 2\}$       $\{1,\ r,\ r^2 - 2,\ r^3 - (17/5)\,r,$
                                $\ r^4 - (31/7)\,r^2 + 72/35\}$

$\{-1,0,1\} \times \{-1,0,1\}$  $\{1,\ r,\ c,\ r^2 - 2/3,\ rc,\ c^2 - 2/3,$
                                $\ r(c^2 - 2/3),\ c(r^2 - 2/3),$
                                $\ (r^2 - 2/3)(c^2 - 2/3)\}$
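
The tensor product property is easy to check numerically. The
following sketch, which assumes NumPy and uses illustrative names
only, builds the nine two dimensional basis functions on the 3x3
window from the one dimensional polynomials on {-1, 0, 1} and
measures the largest inner product between distinct basis
functions.

```python
import itertools
import numpy as np

# One dimensional discrete orthogonal polynomials on {-1, 0, 1}.
r = np.array([-1.0, 0.0, 1.0])
P = [np.ones(3), r, r**2 - 2.0 / 3.0]

# Tensor products P_i(r) Q_j(c) evaluated on the 3x3 window
# (rows index r, columns index c).
basis = {(i, j): np.outer(P[i], P[j]) for i in range(3) for j in range(3)}

def inner(a, b):
    """Discrete inner product: sum over the window of a * b."""
    return float(np.sum(a * b))

# Largest absolute inner product between two different basis functions.
max_off_diagonal = max(
    abs(inner(basis[u], basis[v]))
    for u, v in itertools.combinations(basis, 2)
)
```

Up to floating point rounding, `max_off_diagonal` is zero, which is
the statement proved above for this particular window.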


Figures 1 and 2 show some of the window masks used for the 3x3
and 4x4 cases.

Figure 1 illustrates the 9 masks for the 3x3 window.

Figure 2 illustrates the masks used to obtain the coefficients
of all polynomials up to the quadratic ones for a 4x4 window.

II.3 Fitting Data With Discrete Orthogonal Polynomials

Let an index set $R$ with the symmetry property $r \in R$ implies
$-r \in R$ be given. Let the number of elements in $R$ be $N$. Using
the construction technique, we may construct the set
$\{P_0(r), \ldots, P_{N-1}(r)\}$ of discrete orthogonal polynomials
over $R$.

For each $r \in R$, let a data value $d(r)$ be observed. The
exact fitting problem is to determine coefficients
$a_0, \ldots, a_{N-1}$ such that

$$d(r) = \sum_{n=0}^{N-1} a_n P_n(r)$$

The orthogonality property makes the determination of the
coefficients particularly easy. To find the value of some
coefficient, say $a_m$, multiply both sides of the equation by
$P_m(r)$ and then sum over all $r \in R$.

$$\sum_{r \in R} P_m(r)\, d(r) = \sum_{n=0}^{N-1} a_n \sum_{r \in R} P_n(r) P_m(r) = a_m \sum_{r \in R} P_m^2(r)$$

Hence,

$$a_m = \sum_{r \in R} P_m(r)\, d(r) \Big/ \sum_{r \in R} P_m^2(r) \tag{2}$$

The approximate fitting problem is to determine coefficients
$a_0, \ldots, a_K$, $K \leq N-1$, such that

$$e^2 = \sum_{r \in R} \Bigl[ d(r) - \sum_{n=0}^{K} a_n P_n(r) \Bigr]^2$$

is minimized. To find the value of some coefficient, say $a_m$,
take the partial derivative of both sides of the equation for
$e^2$ with respect to $a_m$. Set it to zero and use the
orthogonality property to find that again

$$a_m = \sum_{r \in R} P_m(r)\, d(r) \Big/ \sum_{r \in R} P_m^2(r) \tag{3}$$

The exact fitting coefficients and the least squares coefficients
are identical for $m = 0, \ldots, K$.
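
A small numerical illustration of equations (2) and (3) follows. It
is a sketch assuming NumPy; the function name `fit_coefficients`
and the sample values are made up for the example. The same
projection formula is applied once with all polynomials (exact fit)
and once with only the constant and linear ones (least squares
fit).

```python
import numpy as np

def fit_coefficients(basis, data):
    """Project data onto each basis vector:
    a_m = sum(P_m * d) / sum(P_m ** 2), as in equations (2) and (3)."""
    return np.array([np.dot(p, data) / np.dot(p, p) for p in basis])

# Discrete orthogonal polynomials on the index set {-1, 0, 1}.
r = np.array([-1.0, 0.0, 1.0])
basis = [np.ones(3), r, r**2 - 2.0 / 3.0]

data = np.array([2.0, 7.0, 3.0])              # arbitrary sample values d(r)

exact = fit_coefficients(basis, data)         # N = 3 polynomials, exact fit
least_sq = fit_coefficients(basis[:2], data)  # K = 1, least squares line

# The exact fit reproduces the data at every point of the index set.
reconstruction = sum(a * p for a, p in zip(exact, basis))
```

Because each coefficient is estimated independently, dropping the
quadratic term leaves the constant and linear coefficients
unchanged.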

Fitting the data values $\{d(r) \mid r \in R\}$ to the polynomial

$$Q(r) = \sum_{n=0}^{K} a_n P_n(r)$$

now permits us to interpret $Q(r)$ as a well behaved real-valued
function defined on the real line. To determine

$$\frac{dQ}{dr}$$

we need only to evaluate

$$\sum_{n=0}^{K} a_n \frac{dP_n}{dr}$$

In this manner, any derivative at any point may be obtained.
Similarly for any definite integrals. Beaudet (1978) uses this
technique for estimating derivatives employed in rotationally
invariant image operators.

It should be noted that the kernel used to estimate a
derivative depends on the neighborhood size, the order of the
fit, and the basis functions used for the fit. Figure 3
illustrates one example of the difference the assumed model
makes. This difference means that the model used must be
justified, the justification being that it is a good fit to the
data. In particular, a not sufficiently good justification for
using first order models is that first order partial derivatives
are being estimated.

Assumed Model                          Kernel Mask for Row Derivative

                                             +----+----+----+
                                             | -1 | -1 | -1 |
                                             +----+----+----+
$g(r,c) = a_{00} + a_{10} r + a_{01} c$        1/6   |  0 |  0 |  0 |
                                             +----+----+----+
                                             |  1 |  1 |  1 |
                                             +----+----+----+

                                             +----+----+----+
                                             |  0 | -1 |  0 |
$g(r,c) = a_{00} + a_{10} r + a_{01} c$              +----+----+----+
$\qquad + \cdots + a_{12}\,(c^2 - 2/3)\, r$    1/2   |  0 |  0 |  0 |
                                             +----+----+----+
                                             |  0 |  1 |  0 |
                                             +----+----+----+

Figure 3 illustrates that the assumed model does make a
difference in the kernel mask used to estimate a quantity such as
row derivative.
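
The two kernels of Figure 3 can be reproduced with a short
computation. The sketch below assumes NumPy and assumes that the
second model is the full nine term tensor product basis on the 3x3
window; the function name `row_derivative_kernel` is illustrative.
Each basis function contributes its coefficient mask, weighted by
the value of its row derivative at the window center.

```python
import numpy as np

r = np.array([-1.0, 0.0, 1.0])
P = [np.ones(3), r, r**2 - 2.0 / 3.0]   # polynomial values on {-1, 0, 1}
dP_at_0 = [0.0, 1.0, 0.0]               # dP_i/dr evaluated at r = 0
P_at_0 = [1.0, 0.0, -2.0 / 3.0]         # P_j(c) evaluated at c = 0

def row_derivative_kernel(terms):
    """Mask whose dot product with a 3x3 window estimates df/dr at the
    center. terms lists (i, j) pairs naming basis functions P_i(r)P_j(c)."""
    kernel = np.zeros((3, 3))
    for i, j in terms:
        b = np.outer(P[i], P[j])        # rows index r, columns index c
        # b / sum(b^2) is the mask for the coefficient of this basis
        # function; the weight is d/dr of the basis function at (0, 0).
        kernel += dP_at_0[i] * P_at_0[j] * b / np.sum(b * b)
    return kernel

planar = row_derivative_kernel([(0, 0), (1, 0), (0, 1)])
full = row_derivative_kernel([(i, j) for i in range(3) for j in range(3)])
```

Under these assumptions `planar` is 1/6 times the mask with rows
(-1,-1,-1), (0,0,0), (1,1,1), and `full` is 1/2 times the mask with
rows (0,-1,0), (0,0,0), (0,1,0).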




III. The Directional Derivative Edge Finder

We denote the directional derivative of $f$ at the point $(r,c)$
in the direction $\alpha$ by $f'_\alpha(r,c)$. It is defined as

$$f'_\alpha(r,c) = \lim_{h \to 0} \frac{f(r + h \sin\alpha,\ c + h \cos\alpha) - f(r,c)}{h} \tag{4}$$

The direction angle $\alpha$ is the clockwise angle from the column
axis. It follows directly from this definition that

$$f'_\alpha(r,c) = \frac{\partial f}{\partial r}(r,c)\, \sin\alpha + \frac{\partial f}{\partial c}(r,c)\, \cos\alpha \tag{5}$$

We denote the second directional derivative of $f$ at the
point $(r,c)$ in the direction $\alpha$ by $f''_\alpha(r,c)$ and it
quickly follows that

$$f''_\alpha = \frac{\partial^2 f}{\partial r^2} \sin^2\alpha + 2 \frac{\partial^2 f}{\partial r\, \partial c} \sin\alpha \cos\alpha + \frac{\partial^2 f}{\partial c^2} \cos^2\alpha \tag{6}$$

Taking $f$ to be a cubic polynomial in $r$ and $c$ which can be
estimated by the discrete orthogonal polynomial fitting
procedure, we can compute the gradient of $f$ and the gradient
direction angle at the center of the neighborhood used to
estimate $f$. Letting $f$ be estimated as a two dimensional cubic

$$f(r,c) = k_1 + k_2 r + k_3 c + k_4 r^2 + k_5 rc + k_6 c^2 + k_7 r^3 + k_8 r^2 c + k_9 r c^2 + k_{10} c^3 \tag{7}$$

we obtain $\alpha$ by

$$\sin\alpha = k_2 / (k_2^2 + k_3^2)^{1/2}, \qquad \cos\alpha = k_3 / (k_2^2 + k_3^2)^{1/2} \tag{8}$$

At any point $(r,c)$, the second directional derivative in the
direction $\alpha$ is given by

$$\begin{aligned} f''_\alpha(r,c) ={} & (6 k_7 \sin^2\alpha + 4 k_8 \sin\alpha \cos\alpha + 2 k_9 \cos^2\alpha)\, r \\ & + (6 k_{10} \cos^2\alpha + 4 k_9 \sin\alpha \cos\alpha + 2 k_8 \sin^2\alpha)\, c \\ & + (2 k_4 \sin^2\alpha + 2 k_5 \sin\alpha \cos\alpha + 2 k_6 \cos^2\alpha) \end{aligned} \tag{9}$$

We wish to only consider points $(r,c)$ on the line in
direction $\alpha$. Hence, $r = \rho \sin\alpha$ and
$c = \rho \cos\alpha$. Then

$$\begin{aligned} f''_\alpha(\rho) ={} & 6\,[k_7 \sin^3\alpha + k_8 \sin^2\alpha \cos\alpha + k_9 \sin\alpha \cos^2\alpha + k_{10} \cos^3\alpha]\, \rho \\ & + 2\,[k_4 \sin^2\alpha + k_5 \sin\alpha \cos\alpha + k_6 \cos^2\alpha] \\ ={} & A\rho + B \end{aligned} \tag{10}$$

If for some $\rho$, $|\rho| < \rho_0$, $f''_\alpha(\rho) = 0$ and
$f'_\alpha(\rho) \neq 0$ we have discovered a zero-crossing of the
second directional derivative taken in the direction of the
gradient and we mark the center pixel of the neighborhood as an
edge pixel.
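
The decision rule can be written out as a short function. The
sketch below is an illustration only: the function name, the use of
a dictionary indexed 1 to 10 for the coefficients of equation (7),
and the handling of the degenerate cases (zero gradient, A equal to
zero) are assumptions not spelled out in the text. The first
directional derivative along the line is obtained by differentiating
the cubic of equation (7) restricted to the line.

```python
import math

def zero_crossing_along_gradient(k, rho0):
    """k maps 1..10 to the cubic coefficients k_1..k_10 of equation (7).
    Returns (is_edge, rho), where rho is the zero-crossing offset along
    the gradient direction, or None when no offset is defined."""
    g = math.hypot(k[2], k[3])
    if g == 0.0:
        return False, None                 # gradient direction undefined
    s, c = k[2] / g, k[3] / g              # sin(alpha), cos(alpha), eq. (8)

    # A and B of equation (10).
    A = 6.0 * (k[7] * s**3 + k[8] * s**2 * c + k[9] * s * c**2 + k[10] * c**3)
    B = 2.0 * (k[4] * s**2 + k[5] * s * c + k[6] * c**2)
    if A == 0.0:
        return False, None                 # second derivative is constant

    rho = -B / A                           # root of A * rho + B = 0
    # First directional derivative along the line at offset rho:
    # d/d(rho) of f(rho * sin(alpha), rho * cos(alpha)).
    first = g + B * rho + 0.5 * A * rho**2
    return (abs(rho) < rho0 and first != 0.0), rho

# Usage: a cubic whose slope along the row direction peaks at the center.
coeffs = {i: 0.0 for i in range(1, 11)}
coeffs[2], coeffs[7] = 1.0, -0.1
is_edge, rho = zero_crossing_along_gradient(coeffs, rho0=0.5)
```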


IV. Statistical Analysis

In this section we show how the randomness of the noise
induces a randomness in the least squares coefficients and then
how the randomness of the least squares coefficients induces a
randomness in the estimated gradient value, the estimated angle
of the gradient, and the estimated location of the zero-crossing.


IV.1 General Model

We let $P_n$, $n = 1, \ldots, N$ denote the names of the discrete
orthonormal basis functions, $\eta$ denote the independent and
identically distributed noise, and $g$ denote the gray tone
intensity function. Under this model, the observed image can be
written as

$$g(r,c) = \sum_{n=1}^{N} a_n P_n(r,c) + \eta(r,c) \tag{11}$$

where

$$\sum_{r,c} P_n(r,c)\, P_m(r,c) = \begin{cases} 0, & n \neq m \\ 1, & n = m \end{cases}$$

and the least squares estimates $\hat{a}_n$ for the unknown
coefficients $a_1, \ldots, a_N$ are given by

$$\hat{a}_n = \sum_{r,c} P_n(r,c)\, g(r,c) \tag{12}$$

Substituting the formula for $g(r,c)$ into the equation for
$\hat{a}_n$ and simplifying results in

$$\hat{a}_n = a_n + \sum_{r,c} P_n(r,c)\, \eta(r,c) \tag{13}$$

clearly showing that $\hat{a}_n$ has a deterministic part and a
random part, the randomness being due to the noise. We assume
that the noise is independent normal having mean 0 and variance
$\sigma^2$. Therefore, the estimated coefficient $\hat{a}_n$ has
mean $a_n$, variance $\sigma^2$ and is uncorrelated with every
other coefficient:

$$E[\hat{a}_n] = a_n$$

$$E[\hat{a}_n \hat{a}_m] = a_n a_m, \qquad n \neq m$$

$$E[\hat{a}_n^2] = a_n^2 + \sigma^2$$

$$V[\hat{a}_n] = \sigma^2.$$

The residual error $e$ is defined as the difference between
the observed values and fitted values. It too is a random
variable.

$$\begin{aligned} e(r,c) &= g(r,c) - \sum_{n=1}^{N} \hat{a}_n P_n(r,c) \\ &= \sum_{n=1}^{N} (a_n - \hat{a}_n)\, P_n(r,c) + \eta(r,c) \end{aligned} \tag{14}$$

It is not difficult to see that at each $(r,c)$, the residual error
has mean zero and is uncorrelated with each estimated coefficient
$\hat{a}_n$ since

$$E[\hat{a}_n\, e(r,c)] = 0$$

After some algebraic substitutions and manipulation, the total
residual error, $S^2$, can be written as

$$S^2 = \sum_{r,c} e^2(r,c) = \sum_{r,c} \eta^2(r,c) - \sum_{n=1}^{N} (\hat{a}_n - a_n)^2 \tag{15}$$

Thus, if the noise is assumed normal and there are $K$ pixels in a
window,

$$\sum_{r,c} \eta^2(r,c) / \sigma^2 \quad \text{has } \chi^2_K,$$

a chi-squared variate with $K$ degrees of freedom,

$$\sum_{n=1}^{N} (\hat{a}_n - a_n)^2 / \sigma^2 \quad \text{has } \chi^2_N,$$

which makes $\sum_{r,c} e^2(r,c) / \sigma^2$ have $\chi^2_{K-N}$.
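
The decomposition of the residual error in equation (15) holds for
every noise realization, not only on average, and can be checked
numerically. The sketch below assumes NumPy; the planar model on a
3x3 window, the coefficient values, the noise level, and the seed
are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)          # seed chosen arbitrarily
r = np.array([-1.0, 0.0, 1.0])
one = np.ones(3)

# Orthonormal basis on the 3x3 window: planar model, N = 3 and K = 9.
raw = [np.outer(one, one), np.outer(r, one), np.outer(one, r)]
basis = [b / np.sqrt(np.sum(b * b)) for b in raw]

a_true = np.array([100.0, 5.0, -3.0])
noise = rng.normal(0.0, 2.0, size=(3, 3))
g = sum(a * b for a, b in zip(a_true, basis)) + noise        # equation (11)

a_hat = np.array([np.sum(b * g) for b in basis])             # equation (12)
residual = g - sum(a * b for a, b in zip(a_hat, basis))      # equation (14)

lhs = np.sum(residual**2)                                    # S^2
rhs = np.sum(noise**2) - np.sum((a_hat - a_true) ** 2)       # equation (15)
```

The two sides `lhs` and `rhs` agree to rounding error, and the
estimation error `a_hat - a_true` equals the projection of the
noise onto each basis function, as in equation (13).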


IV.2 Estimating the First Partials

If the discrete orthogonal basis functions are polynomials
then each first partial derivative at (0,0) in the row and column
directions is given as some linear combination of the estimated
coefficients. Furthermore, the linear combination for the row
partial will be orthogonal to the linear combination for the
column partial. Letting the coefficients of the linear
combination for the row partial be $s_1, \ldots, s_N$ and the
coefficients of the linear combination for the column partial be
$t_1, \ldots, t_N$, where

$$\sum_{n=1}^{N} s_n^2 = \sum_{n=1}^{N} t_n^2 = k,$$

we have,

$$\mu_r = \sum_{n=1}^{N} s_n a_n$$

$$\mu_c = \sum_{n=1}^{N} t_n a_n$$

as the true but unknown values of the row and column partials.
The estimates are

$$\hat{\mu}_r = \sum_{n=1}^{N} s_n \hat{a}_n$$

$$\hat{\mu}_c = \sum_{n=1}^{N} t_n \hat{a}_n$$

and they have mean and variance given by

$$E[\hat{\mu}_r] = \mu_r$$

$$E[\hat{\mu}_c] = \mu_c$$

$$V[\hat{\mu}_r] = \sigma^2 k$$

$$V[\hat{\mu}_c] = \sigma^2 k$$

$$E[(\hat{\mu}_r - \mu_r)(\hat{\mu}_c - \mu_c)] = 0$$

Hence, the estimates for the row and column partial derivatives
are uncorrelated.

IV.3 Hypothesis Testing For Zero Gradient

To see the effect of the randomness on the estimate of
the gradient magnitude, consider testing the hypothesis that
$\mu_r = \mu_c = 0$. This hypothesis must be rejected if there is
to be a zero-crossing of second directional derivative. Under
this hypothesis,

$$\frac{\hat{\mu}_r^2 + \hat{\mu}_c^2}{k \sigma^2}$$

has a $\chi^2_2$ distribution.

The total residual error normalized by the noise variance,
$S^2 / \sigma^2$, has a $\chi^2_{K-N}$ distribution. Hence

$$\frac{(\hat{\mu}_r^2 + \hat{\mu}_c^2) / 2}{k\, S^2 / (K-N)}$$

has an $F_{2,K-N}$ distribution and the hypothesis of
$\mu_r = \mu_c = 0$ would be rejected for suitably large values.
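
As an illustration, the test statistic and its tail probability
can be computed without a statistics library, because an F
distribution with 2 numerator degrees of freedom has a closed form
tail. The function name, argument names, and the default
significance level below are assumptions made for this sketch; $k$
is the constant appearing in the variances of section IV.2.

```python
def zero_gradient_test(mu_r_hat, mu_c_hat, k, S2, K, N, significance=0.05):
    """Test the hypothesis mu_r = mu_c = 0.
    Returns (statistic, p_value, reject)."""
    dof = K - N                      # degrees of freedom of S^2 / sigma^2
    statistic = ((mu_r_hat**2 + mu_c_hat**2) / 2.0) / (k * S2 / dof)
    # For an F(2, dof) variate the upper tail probability is
    #   P(F > x) = (1 + 2 * x / dof) ** (-dof / 2)
    p_value = (1.0 + 2.0 * statistic / dof) ** (-dof / 2.0)
    return statistic, p_value, p_value < significance

# Usage with made-up numbers: a 5x5 window (K = 25) and a cubic fit (N = 10).
stat, p, reject = zero_gradient_test(10.0, 0.0, k=1.0, S2=15.0, K=25, N=10)
```

A rejected hypothesis means the gradient is judged non-zero, which
is the precondition for looking for a zero-crossing at all.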

IV.4 Confidence Interval for Gradient Direction

To see the effect of the randomness on the estimate of the
direction of the gradient, consider the relationships portrayed
in figure 4. The axes are the row and column partials $\mu_r$ and
$\mu_c$. The direction angle $\theta$ of the gradient is given by

$$\cos\theta = \mu_r / (\mu_r^2 + \mu_c^2)^{1/2}, \qquad \sin\theta = \mu_c / (\mu_r^2 + \mu_c^2)^{1/2} \tag{16}$$

The center of the circle is at the estimate
$(\hat{\mu}_r, \hat{\mu}_c)$. Upon substituting the estimates
$\hat{\mu}_r$ and $\hat{\mu}_c$ for $\mu_r$ and $\mu_c$, we obtain
the estimated direction angle $\hat{\theta}$ by

$$\cos\hat{\theta} = \hat{\mu}_r / (\hat{\mu}_r^2 + \hat{\mu}_c^2)^{1/2}, \qquad \sin\hat{\theta} = \hat{\mu}_c / (\hat{\mu}_r^2 + \hat{\mu}_c^2)^{1/2} \tag{17}$$

From a Bayesian point of view, the area of the circle
represents the conditional probability that the unknown
$(\mu_r, \mu_c)$ lies within a distance $R$ from the observed
$(\hat{\mu}_r, \hat{\mu}_c)$ given that the variance of
$\hat{\mu}_r$ and $\hat{\mu}_c$ is known and equal to $k\sigma^2$.
Assuming a normal distribution for the noise, this conditional
probability is $q = 1 - e^{-R^2 / (2 k \sigma^2)}$. Hence, if
probability $q$ is given, the corresponding radius $R$ is

$$R = k^{1/2} \sigma\, [-2 \log(1-q)]^{1/2} \tag{18}$$

To determine a confidence interval for $\theta$ of the form
$\hat{\theta} \pm \Delta$, we have from figure 4 that

$$\sin^2 \Delta = \frac{k \sigma^2\, (-2 \log(1-q))}{\hat{\mu}_r^2 + \hat{\mu}_c^2} \tag{19}$$
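
Equation (19) translates directly into code. The sketch below is
illustrative: the function name and the choice to return `None`
when the confidence circle contains the origin (so that no finite
interval for the direction exists) are assumptions, not part of the
original text.

```python
import math

def direction_half_width(mu_r_hat, mu_c_hat, k, sigma2, q):
    """Half width Delta (in radians) of the confidence interval for the
    gradient direction, or None when the circle of radius R reaches the
    origin and the direction is unconstrained."""
    radius_sq = k * sigma2 * (-2.0 * math.log(1.0 - q))   # R^2, eq. (18)
    grad_sq = mu_r_hat**2 + mu_c_hat**2
    if radius_sq >= grad_sq:
        return None
    return math.asin(math.sqrt(radius_sq / grad_sq))      # eq. (19)

# Usage: choose q so that R^2 = k * sigma^2, with a gradient of magnitude 2.
q = 1.0 - math.exp(-0.5)
delta = direction_half_width(2.0, 0.0, k=1.0, sigma2=1.0, q=q)
```

In this usage example the ratio of radius to gradient magnitude is
1/2, so `delta` is 30 degrees expressed in radians.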
Note that the $2\Delta$ confidence interval length depends on the
probability $q$ of the circle confidence region for
$(\mu_r, \mu_c)$ and the unknown noise variance $\sigma^2$. Although
$\sigma^2$ is not known, we do know $S^2$, which has a
$\sigma^2 \chi^2_{K-N}$ distribution. We can handle the problem of
the unknown $\sigma^2$ by determining a joint confidence region for
$(\mu_r, \mu_c)$ and $\sigma^2$ (Foutz, 1981). Taking $p$ to be the
probability that a chi-squared random variable with $K-N$ degrees
of freedom has an observed value greater than $\chi^2_{K-N,p}$, we
have the confidence interval $(0,\ S^2 / \chi^2_{K-N,p}]$ for
$\sigma^2$ having at least probability $p$. Replacing $\sigma^2$ in
equation (19) by $S^2 / \chi^2_{K-N,p}$ we obtain

$$\sin^2 \Delta = \frac{k\, S^2\, (-2 \log(1-q))}{\chi^2_{K-N,p}\, (\hat{\mu}_r^2 + \hat{\mu}_c^2)} \tag{20}$$

A confidence interval for $\theta$ having at least probability
$pq$ is then $(\hat{\theta} - \Delta,\ \hat{\theta} + \Delta)$.

Figure 4 illustrates the geometry of the confidence interval
estimation for the edge angle.

IV.5 Edge Hypothesis Testing

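In this section we first take the edge direction $\alpha$ to be a
fixed constant. We let $\mu_A$ and $\mu_B$ be the expected values
of the random variables $A$ and $B$ appearing in equation (10). The
null hypothesis is that an edge exists. The null hypothesis is
satisfied if for some $\rho$, $0 \leq \rho \leq d$,
$\mu_A \rho + \mu_B = 0$.

The observed random variables are $A$, $B$, and the residual
fitting error $S^2$. The bivariate random variable

$$\begin{pmatrix} A \\ B \end{pmatrix}$$

is normal having mean

$$\begin{pmatrix} \mu_A \\ \mu_B \end{pmatrix}$$

and covariance

$$\sigma^2 \begin{pmatrix} k_A & 0 \\ 0 & k_B \end{pmatrix}$$

where $k_A$ and $k_B$ are known constants. For a window of $K$
pixels and a cubic fit, $S^2 / \sigma^2$ has a $\chi^2_{K-10}$
distribution. From this it follows that

$$Z(\mu_A, \mu_B) = \frac{\bigl[ (A - \mu_A)^2 / k_A + (B - \mu_B)^2 / k_B \bigr] / 2}{S^2 / (K-10)}$$

has an $F_{2,K-10}$ distribution.

We define $R = \{(x,y) \mid \text{for some } \rho,\ 0 \leq \rho \leq d,\ x\rho + y = 0\}$.
Then the null hypothesis is rejected at the $p$ significance
level if

$$\min_{(\mu_A, \mu_B) \in R} Z(\mu_A, \mu_B)$$

is larger than $F_{2,K-10,p}$.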
An edge strength probability can be defined by $q$ where $q$
satisfies

$$\min_{(\mu_A, \mu_B) \in R} Z(\mu_A, \mu_B) = F_{2,K-10,q}$$


Of course the edge direction $\alpha$ is not fixed. But we do
have a confidence interval for it. And for each value of $\alpha$
in the confidence interval, the random variables $A(\alpha)$ and
$B(\alpha)$ can be computed and the null hypothesis tested. If for
all $\alpha$ in the confidence interval the null hypothesis is
rejected, then the existence of an edge is also rejected.

In practice, we can perform a non-exact hypothesis test
selecting only the left end, middle, and right end values of
$\alpha$ from its confidence interval. If for each of these three
values of $\alpha$ the null hypothesis is rejected, then the
existence of an edge is also rejected.


V. Experimental Results 

To understand the performance of the second directional
derivative zero-crossing digital step edge operator we examine
its behavior on a well structured simulated data set and on a
real aerial image. For the simulated data set, we use a 100x100
pixel image of a checkerboard, the checks being 20x20 pixels.
The dark checks have gray tone intensity 75 and the light checks
have gray tone intensity 175. To this perfect checkerboard we
add independent Gaussian noise having mean zero and standard
deviation 50. Defining the signal to noise ratio as 10 times the
base 10 logarithm of the range of signal divided by RMS of the
noise, the simulated image has a 3 db signal to noise ratio. The
perfect and noisy checkerboards are shown in figure 5.
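
A comparable test image is easy to generate. The sketch below
assumes NumPy; which corner check is dark and the random seed are
arbitrary choices, so the noise realization will not match the one
used in the experiments.

```python
import numpy as np

def checkerboard(size=100, check=20, dark=75.0, light=175.0):
    """Noise-free checkerboard with check x check pixel squares."""
    rows, cols = np.indices((size, size))
    parity = (rows // check + cols // check) % 2
    return np.where(parity == 0, dark, light)

rng = np.random.default_rng(0)          # seed chosen arbitrarily
perfect = checkerboard()
noisy = perfect + rng.normal(0.0, 50.0, size=perfect.shape)

# Signal to noise ratio as defined in the text: 10 log10(range / RMS noise).
signal_range = perfect.max() - perfect.min()
snr_db = 10.0 * np.log10(signal_range / 50.0)
```

With a signal range of 100 and a noise standard deviation of 50,
`snr_db` is 10 log10(2), about 3 db.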

Section V.1 illustrates the performance of the classic 3x3
edge operators with and without preaveraging compared against the
generalized Prewitt operator. Section V.2 illustrates the
performance of the Marr-Hildreth zero-crossing of Laplacian
operator, the 11x11 Prewitt operator, and the 11x11 zero-crossing
of second directional derivative operator. The zero-crossing of
second directional derivative surpasses the performance of the
other two on the twofold basis of probability of correct
assignment and error distance, which is defined as the average
distance to the closest true edge pixel of pixels which are
assigned non-edge but which are true edge pixels.

V.1 The Classic Edge Operators

The classic 3x3 gradient operators all perform badly as 
shown in figure 6. Note that the usual definition of the Roberts 
operator has been modified in the natural way so that it uses a 
3x3 mask. 

Averaging before the application of the gradient operator is
considered to be the cure for such bad performance on noisy
images (Rosenfeld and Kak, 1976). Figure 6 also shows the same
operators applied after box filtering with 3x3, 5x5, and 7x7
neighborhood sizes.

Figure 5 illustrates the noisy checkerboard used in the
experiments. Low intensity is 75, high intensity is 175.
Standard deviation of noise is 50.

An alternative to the preaveraging is to define the gradient
operator with a larger window. This is easily done with the
Prewitt operator (Prewitt, 1970) which fits a quadratic surface in
every window and uses the square root of the sum of the squares
of the coefficients of the linear terms to estimate the gradient.
(A linear fit actually yields the same result for the polynomial
basis functions. A cubic fit is the first higher order fit which
would yield a different result.) This is illustrated in figure
7. A 3x3 pre-average followed by a 3x3 gradient operator yields
a resulting neighborhood size of 5x5. Thus in figure 7 we also
show the 3x3 preaverage followed by a 3x3 gradient under the 5x5
Prewitt and we show the 5x5 pre-average followed by the 3x3
gradient under the 7x7 Prewitt. The noise is higher in the
pre-average edge-detector. For comparison purposes the 5x5 Nevatia
and Babu (1979) compass operator is shown alongside the 5x5
Prewitt in figure 8. They give virtually the same result. The
Prewitt operator has the advantage of requiring half the
computation.
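
The generalized Prewitt gradient for a single window can be
sketched as follows, assuming NumPy and an odd-sized square window
whose rows index $r$ and columns index $c$. The function name is
illustrative. Because the linear basis functions are orthogonal to
the constant and quadratic ones, the linear coefficients can be
computed on their own, which is why a linear and a quadratic fit
give the same gradient.

```python
import numpy as np

def prewitt_gradient(window):
    """Gradient magnitude from the least squares coefficients of the
    linear terms r and c fitted over a square window."""
    n = window.shape[0]
    coords = np.arange(n) - (n - 1) / 2.0      # symmetric index set
    norm = n * np.sum(coords**2)               # sum of r^2 over the window
    k_r = np.sum(coords[:, None] * window) / norm
    k_c = np.sum(coords[None, :] * window) / norm
    return float(np.hypot(k_r, k_c))

# Usage: a 5x5 window sampled from the plane 2r + 3c + 7.
coords = np.arange(5) - 2.0
R, C = np.meshgrid(coords, coords, indexing="ij")
gradient = prewitt_gradient(2.0 * R + 3.0 * C + 7.0)
```

For the plane in the usage example the result is the square root of
13, and adding a purely quadratic term such as $r^2$ to the window
leaves it unchanged.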

It is obvious from these results that good gradient 
operators must have larger neighborhood sizes than 3x3. 
Unfortunately, the larger neighborhood sizes also yield thicker
edges.

Figure 7 illustrates the Prewitt operator done by using a least
squares quadratic fit in the neighborhood versus preaveraging
and using a smaller fitting neighborhood size. The no
preaveraging results show slightly higher contrast.

Figure 8 compares the Nevatia and Babu compass operator with the
Prewitt operator in a 5x5 neighborhood.

To detect edges, the gradient value must be thresholded. In
each case, we chose a threshold value which makes the conditional
probability of assigning an edge given that there is an edge
equal to the conditional probability of there being a true edge
given that an edge is assigned. True edges are established by
defining them to be the two pixel wide region in which each pixel
neighbors some pixel having a value different from it on the
perfect checkerboard. Figure 9 shows the thresholded Prewitt
operator (quadratic fit) for a variety of neighborhood sizes.
Notice that because the gradient is zero at the saddle points
(the corner where four checks meet), any operator depending on
the gradient to detect an edge will have trouble there.
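
The true edge definition and the threshold selection rule can be
sketched as below, assuming NumPy. Several details are assumptions
made for the illustration: neighbors are taken to be 4-neighbors,
the threshold is picked from a supplied list of candidates, and all
function names are hypothetical.

```python
import numpy as np

def true_edge_mask(perfect):
    """Mark pixels having a 4-neighbor of different value on the
    noise-free image, giving a two pixel wide true edge region."""
    edge = np.zeros(perfect.shape, dtype=bool)
    diff_rows = perfect[1:, :] != perfect[:-1, :]
    diff_cols = perfect[:, 1:] != perfect[:, :-1]
    edge[1:, :] |= diff_rows
    edge[:-1, :] |= diff_rows
    edge[:, 1:] |= diff_cols
    edge[:, :-1] |= diff_cols
    return edge

def conditional_probabilities(assigned, truth):
    """Return (P(assigned edge | true edge), P(true edge | assigned edge))."""
    both = np.count_nonzero(assigned & truth)
    p_assigned_given_true = both / max(np.count_nonzero(truth), 1)
    p_true_given_assigned = both / max(np.count_nonzero(assigned), 1)
    return p_assigned_given_true, p_true_given_assigned

def equalizing_threshold(gradient, truth, candidates):
    """Candidate threshold whose two conditional probabilities are closest."""
    def gap(t):
        a, b = conditional_probabilities(gradient > t, truth)
        return abs(a - b)
    return min(candidates, key=gap)
```

The two conditional probabilities share the same numerator, so they
are equal exactly when the number of assigned edge pixels matches
the number of true edge pixels.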


Figure 9 illustrates the edges obtained by thresholding the
results of the Prewitt operator.

V.2 The Second Derivative Zero Crossing Edge Operators

Marr and Hildreth (1980) suggest an edge operator based on
the zero crossing of a generalized Laplacian. In effect, this is a
non-directional or isotropic second derivative zero crossing
operator. The mask for this generalized Laplacian operator is
defined by sampling the generalized Laplacian (Mexican hat) kernel
at row column coordinates $(r,c)$ designating the center of each
pixel position in the neighborhood and then setting the value $k$
so that the sum of the resulting weights is zero. Edges are
detected at all pixels whose generalized Laplacian value is of
one sign and one of whose neighbors has a generalized Laplacian
value of the opposite sign. A zero-crossing threshold strength
can be introduced here by insisting that the difference between
the positive value and the negative value must exceed the
threshold value before the pixel is declared to be an edge pixel.
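
The zero-crossing detection step, applied to an already computed
generalized Laplacian image, can be sketched as follows. NumPy is
assumed, neighbors are taken to be 4-neighbors (an assumption, since
the neighborhood is not specified), and the function name is
illustrative. Both pixels of a sign-changing pair are marked,
following the description above.

```python
import numpy as np

def laplacian_zero_crossings(lap, threshold=0.0):
    """Mark a pixel when one of its 4-neighbors has the opposite sign
    and the difference between the positive and negative values
    exceeds threshold."""
    edge = np.zeros(lap.shape, dtype=bool)

    def mark(a, b, edge_a, edge_b):
        # a and b are aligned views of neighboring pixels; edge_a and
        # edge_b are the matching views of the output, updated in place.
        crossing = (a * b < 0) & (np.abs(a - b) > threshold)
        edge_a |= crossing
        edge_b |= crossing

    mark(lap[1:, :], lap[:-1, :], edge[1:, :], edge[:-1, :])   # vertical pairs
    mark(lap[:, 1:], lap[:, :-1], edge[:, 1:], edge[:, :-1])   # horizontal pairs
    return edge
```

Raising `threshold` removes weak zero crossings first, which is the
trade-off between missed and spurious edge pixels discussed for
this operator.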

Figure 10 illustrates the edge images produced by this technique
for a variety of threshold values and a variety of values for
$\sigma$ for an 11 by 11 window. It is apparent that if all edge
pixels are to be detected, there will be many pixels declared to
be edge pixels which are really not edge pixels. And if there are
to be no pixels declared edge pixels which are not edge pixels,
then there will be many edge pixels which are not detected. Its
performance is poorer than the Prewitt operator.

The directional second derivative zero crossing edge
operator introduced in this paper is shown in figure 11 for a
variety of gradient threshold values. If the gradient exceeds
the threshold value and a zero-crossing occurs in a direction of
± 14.9 degrees of the gradient direction within a circle of one
pixel length centered in the pixel, then the pixel is declared to
be an edge pixel. This technique performs the worst at the
saddle points, the corner where four checks meet, because of
there being a zero gradient there.

Figure 10 illustrates the edges obtained by the 11 x 11 Marr-Hildreth 
zero-crossing of Laplacian operator set for different zero-crossing 
thresholds and three different standard deviations for the 
associated Mexican hat filter. 

Figure 11 illustrates the directional derivative edge operator 
for 4 different thresholds. 

Table 1 shows the comparison among the Prewitt operator and 
the directional and the Marr-Hildreth non-directional second 
derivative zero crossing edge operators. The threshold used is, 
as before, the one equalizing the conditional probability of 
assigned edge given true edge and the conditional probability of 
true edge given assigned edge. It is clear that the performance 
of the directional derivative operator is better than that of the 
Prewitt operator and the Marr-Hildreth operator, both on the basis 
of the correct assignment probability and the error distance, which 
is the average distance to closest true edge pixels of pixels which 
are assigned non-edge labels but which are true edge pixels. 
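
The threshold selection described above can be sketched as follows. This is an illustration under stated assumptions, not the evaluation code behind Table 1: edge images are lists of rows of booleans, a pixel counts as correctly assigned only when it coincides exactly with a true edge pixel, and the function names and the candidate-threshold list are hypothetical.

```python
def edge_scores(assigned, true):
    """Return (P(AE|TE), P(TE|AE)) for two equal-sized boolean images."""
    both = n_assigned = n_true = 0
    for arow, trow in zip(assigned, true):
        for a, t in zip(arow, trow):
            both += bool(a and t)
            n_assigned += bool(a)
            n_true += bool(t)
    p_ae_te = both / n_true if n_true else 0.0
    p_te_ae = both / n_assigned if n_assigned else 0.0
    return p_ae_te, p_te_ae

def equalizing_threshold(strength, true, candidates):
    """Among `candidates`, pick the threshold whose two conditional
    probabilities are closest. Returns (threshold, P(AE|TE), P(TE|AE)),
    or None if no candidate produces a correctly assigned edge pixel."""
    best = None
    for thr in candidates:
        assigned = [[v > thr for v in row] for row in strength]
        p1, p2 = edge_scores(assigned, true)
        if p1 == 0.0 and p2 == 0.0:
            continue  # degenerate: nothing correctly assigned at this threshold
        gap = abs(p1 - p2)
        if best is None or gap < best[0]:
            best = (gap, thr, p1, p2)
    return None if best is None else best[1:]
```

Both probabilities share the same numerator, so under these assumptions they are equal exactly when the number of assigned edge pixels equals the number of true edge pixels.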

Figure 12 shows the corresponding edge images of the 11x11 
Prewitt operator using a cubic fit rather than a quadratic fit, 
the 11x11 Marr-Hildreth operator, and the 11x11 directional 
derivative zero-crossing operator. The thresholds used are the 
ones to equalize the conditional probabilities as given in Table 
1. A visual evaluation also leaves the impression that the 
directional derivative operator produces better edge continuity 
and has less noise than the other two. 

For the case of constant variance additive noise, 
thresholding on the basis of the hypothesis test of section IV.3 
yields essentially the same results as simply thresholding the 
gradient value. 

Figure 13 illustrates the second directional derivative zero 
crossing operator on an aerial image which has been median 
filtered and then enhanced by replacing each pixel with the 
closer of its 3x3 neighborhood minimum or maximum. The technique 
is so good that it is possible to determine region boundaries 
essentially by doing a connected components operation on the 
non-edge pixels. Figure 13b shows the cleaned edge image which is 
obtained by doing a connected components operation on the non-edge 
pixels, then removing all pixels whose region has fewer than 20 
pixels. The resulting boundaries are given as pixels which have a 
neighbor with a label different from their own. 
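
Two of the steps just described, the min/max enhancement and the connected components cleanup, can be sketched as below. This is only an illustration under assumptions the text does not settle: images are lists of rows, ties in the enhancement go to the minimum, regions are 4-connected, pixels of removed regions and edge pixels both receive label 0, and a boundary pixel is any pixel with a 4-neighbor carrying a different label. The median filtering and the edge operator itself are not shown, and all names are hypothetical.

```python
from collections import deque

NEIGHBORS4 = ((-1, 0), (1, 0), (0, -1), (0, 1))

def enhance_min_max(img):
    """Replace each pixel by the closer of its 3x3 neighborhood min or max."""
    rows, cols = len(img), len(img[0])
    out = [row[:] for row in img]
    for r in range(rows):
        for c in range(cols):
            window = [img[i][j]
                      for i in range(max(r - 1, 0), min(r + 2, rows))
                      for j in range(max(c - 1, 0), min(c + 2, cols))]
            lo, hi = min(window), max(window)
            # Ties go to the minimum (an assumption; the text is silent).
            out[r][c] = lo if img[r][c] - lo <= hi - img[r][c] else hi
    return out

def clean_regions(edge, min_size=20):
    """Label 4-connected regions of non-edge pixels, drop regions smaller
    than `min_size`, and return (labels, boundary). Label 0 marks edge
    pixels and pixels of dropped regions."""
    rows, cols = len(edge), len(edge[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 1
    for r in range(rows):
        for c in range(cols):
            if edge[r][c] or labels[r][c] != 0:
                continue
            labels[r][c] = next_label
            region = [(r, c)]
            queue = deque(region)
            while queue:  # breadth-first flood fill of one region
                i, j = queue.popleft()
                for di, dj in NEIGHBORS4:
                    ni, nj = i + di, j + dj
                    if (0 <= ni < rows and 0 <= nj < cols
                            and not edge[ni][nj] and labels[ni][nj] == 0):
                        labels[ni][nj] = next_label
                        region.append((ni, nj))
                        queue.append((ni, nj))
            if len(region) < min_size:
                for i, j in region:
                    labels[i][j] = -1  # temporary mark for a dropped region
            else:
                next_label += 1
    labels = [[max(v, 0) for v in row] for row in labels]
    boundary = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for dr, dc in NEIGHBORS4:
                rr, cc = r + dr, c + dc
                if (0 <= rr < rows and 0 <= cc < cols
                        and labels[rr][cc] != labels[r][c]):
                    boundary[r][c] = True
                    break
    return labels, boundary
```

Because edge pixels keep label 0, this literal reading marks pixels on both sides of every region border, so the boundaries it draws are at least two pixels thick; a thinner rendering would need a convention the text does not give.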

Initial raw edges which leave gaps in a region boundary will 
in effect make the regions merge in the connected components 
step. Thus the small number of missing boundaries is surprising. 
To be sure, we are not advocating connected components as an 
image segmentation technique. The fact that it works as well as 
it does is an indication of the strength of the edge detector. 

Figure 13b illustrates the directional derivative edges obtained 
from the aerial photograph by first 3x3 median filtering, then 
replacing each pixel by the closer of its 3x3 neighborhood minimum 
or maximum, then taking the directional derivative edges using a 
7x7 window, then doing a connected components operation on the 
non-edge pixels, removing all regions having fewer than 20 pixels, 
and then displaying as an edge pixel any pixel neighboring a pixel 
with a different label. 

                  Prewitt            Marr-Hildreth      Directional Derivative 

Parameters        Gradient           Zero-crossing      Gradient 
                  Threshold = 18.5   Strength = 4.0     Threshold = 14.0 
                                     σ = 5.0            ρ = .5 

P(AE|TE)          .6738              .3977              .7207 

P(TE|AE)          .6872              .4159              .7197 

Error Distance    1.79               1.76               1.16 


Table 1 compares the performance of three edge operators using an 
11x11 window on the noisy checkerboard image. Thresholds are chosen 
to equalize, as best as possible, P(AE|TE), the conditional 
probability of assigned edge given true edge, and P(TE|AE), the 
conditional probability of true edge given assigned edge. The error 
distance is the average distance to closest true edge pixels of pixels 
which are assigned non-edge but which are true edge. 

VI. Conclusions 

We have argued that numeric digital image operations should 
be explained in terms of their actions on the underlying gray 
tone intensity surface of which the digital image is an observed 
noisy sample. We called this model the facet model for digital 
image processing and showed how the facet model can be used to 
estimate in each neighborhood the underlying gray tone intensity 
surface. 

We described a digital step edge operator which detects 
edges at all pixels whose estimated second directional derivative 
taken in the direction of the gradient has a zero crossing within 
the pixel's area. We discussed the statistical analysis of this 
technique, illustrating how to determine confidence intervals for 
the direction of the gradient and how this determines a 
confidence interval for the placement of the zero-crossing. 

We have compared the performance of the directional 
derivative zero crossing edge operator with that of the classic 
edge operators, the generalized Prewitt gradient operator, and 
the Marr-Hildreth zero crossing edge operator. We found that in 
both the simulated and real image data sets the directional 
derivative zero crossing edge operator had superior performance. 


We have illustrated that for good performance it is important 
to use larger neighborhood sizes than 3x3 and have shown that 
better results are achieved by defining the edge operator 
naturally in the large neighborhood rather than pre-averaging 
and then using a smaller neighborhood edge operator on 
the averaged image. 

There is much work yet to be done. We need to explore the 
relationship of basis function kind (polynomial, trigonometric 
polynomial, etc.), order of fit, and neighborhood size to the 
goodness of fit. Evaluation must be made of the confidence 
intervals produced by the technique. The technique needs to be 
generalized so that it works on saddle points created by two 
edges crossing. A suitable edge linking method needs to be 
developed which uses these confidence intervals. Ways of 
incorporating semantic information and ways of using variable 
resolution need to be developed. An analogous technique for roof 
edges needs to be developed. We hope to explore these issues in 
future papers. 